Skip to content

Report hardware config errors to PtpConfig status conditions#130

Closed
gtannous-spec wants to merge 2 commits intok8snetworkplumbingwg:mainfrom
gtannous-spec:hwconfig-typo
Closed

Report hardware config errors to PtpConfig status conditions#130
gtannous-spec wants to merge 2 commits intok8snetworkplumbingwg:mainfrom
gtannous-spec:hwconfig-typo

Conversation

@gtannous-spec
Copy link
Copy Markdown
Collaborator

@gtannous-spec gtannous-spec commented Dec 29, 2025

Hardware Plugin Validation & Status Reporting

Summary

This PR adds runtime validation for hardware plugin names in PtpConfig and reports configuration errors directly to the PtpConfig CR status. When users accidentally misconfigure plugin names (e.g., typo e81 instead of e810), the daemon now:

  1. Detects the invalid plugin name at runtime
  2. Reports a clear warning to the PtpConfig status via HardwarePluginConfigurationWarning condition
  3. Automatically clears the warning when the configuration is corrected

Problem

Previously, if a user made a typo in the hardware plugin configuration section of a PtpConfig, the daemon would silently fail to configure the hardware but continue running. This made debugging configuration issues difficult.

Example of a typo that would go unnoticed:

plugins:
  e81:  # Typo! Should be "e810"
    enableDefaultConfig: false

Solution

The PluginManager now validates plugin names against registered plugins and uses a dynamic client to update the PtpConfig status when errors are found.

New Status Condition

status:
  conditions:
    - type: HardwarePluginConfigurationWarning
      status: "True"
      reason: InvalidPluginName
      message: "Profile 'my-profile' contains unrecognized hardware plugin(s): [e81]. Valid plugins are: [e810 e825 e830 ntpfailover]"

Files Changed

File Changes
pkg/plugin/plugin.go Added ValidateAndReportPluginErrors() method to PluginManager with dynamic client status updates
pkg/plugin/plugin_test.go Unit tests for ValidateProfilePlugins function
pkg/daemon/daemon.go Integrated validation call after profiles are applied; passes RestConfig to PluginManager
pkg/daemon/daemon_internal_test.go Updated test calls to include new restConfig parameter
test/ptpconfig-invalid-plugin-test.yaml Test manifest with invalid plugin name for integration testing
test/ptpconfig-valid-plugin-test.yaml Test manifest with valid plugin name for comparison

Implementation Details

PluginManager Changes (pkg/plugin/plugin.go)

  • Added RestConfig and Namespace fields for dynamic client creation
  • New methods:
    • ValidateAndReportPluginErrors() - Main validation entry point
    • updatePtpConfigStatusWithPluginError() - Adds warning condition
    • clearPtpConfigPluginWarning() - Removes warning when fixed
    • removePluginWarningCondition() - Helper for condition removal

Daemon Integration (pkg/daemon/daemon.go)

  • Validation runs at the end of applyNodePTPProfiles() after profiles are applied
  • PluginManager receives RestConfig and Namespace during daemon initialization

Testing

Unit Tests

go test ./pkg/plugin/... -v

Integration Testing

  1. Apply invalid config:
kubectl apply -f test/ptpconfig-invalid-plugin-test.yaml
  1. Check for warning condition:
kubectl get ptpconfig test-invalid-plugin -n openshift-ptp -o jsonpath='{.status.conditions}' | jq
  1. Expected output:
[{
  "type": "HardwarePluginConfigurationWarning",
  "status": "True",
  "reason": "InvalidPluginName",
  "message": "Profile 'test-invalid-plugin' contains unrecognized hardware plugin(s): [e81]. Valid plugins are: [e810 e825 e830 ntpfailover]"
}]
  1. Fix the plugin name and verify warning is cleared:
kubectl patch ptpconfig test-invalid-plugin -n openshift-ptp --type='json' \
  -p='[{"op": "replace", "path": "/spec/profile/0/plugins", "value": {"e810": {"enableDefaultConfig": false}}}]'

# Wait ~30s for next config cycle, then verify condition is removed
kubectl get ptpconfig test-invalid-plugin -n openshift-ptp -o jsonpath='{.status.conditions}'

Notes

  • Uses Kubernetes dynamic client to avoid requiring API changes in ptp-operator
  • Validation occurs when profiles are applied (not on a periodic ticker)
  • Valid plugin names are determined by registered plugins in PluginManager

@gtannous-spec gtannous-spec added the ok-to-test ok to test label Dec 29, 2025
@github-actions
Copy link
Copy Markdown

Thanks for your PR,
Best regards.

@github-actions github-actions bot removed the ok-to-test ok to test label Dec 31, 2025
@gtannous-spec gtannous-spec added the ok-to-test ok to test label Dec 31, 2025
@github-actions github-actions bot removed the ok-to-test ok to test label Dec 31, 2025
@gtannous-spec gtannous-spec added the ok-to-test ok to test label Dec 31, 2025
@github-actions github-actions bot removed the ok-to-test ok to test label Dec 31, 2025
@gtannous-spec gtannous-spec added the ok-to-test ok to test label Dec 31, 2025
@gtannous-spec gtannous-spec marked this pull request as ready for review January 6, 2026 16:08
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can changes in here be condensed at all?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean adding all the file changed to be condensed in pkg/plugin/plugin.go ?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean does it need to be a 500 line change? Or is this something that overall can be done more concisely?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The simpler way is letting daemon write the failure status in the ptpconfig.
However, the ptpconfig controller is disabled in the linuxptp daemon. So I'm trying to use dynamic client here to write to the status of the ptpconfig using the dynamic client library function UpdateStatus and just skipping the controller which is imo complicating it to 500 line change.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

migrated the logic to PR#152
It is slightly more condensed, if that makes more sense :)

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

migrated the logic to PR#152 It is slightly more condensed, if that makes more sense :)

Ack, will close it

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants